This vignette will teach you all about TFs and motiv enrichment analyses for your scATAC data! For example, you will perform DNA sequence motif analysis in Signac. For this, you will explore two complementary options for performing motif analysis: one by finding overrepresented motifs in a set of differentially accessible peaks, one method performing differential motif activity analysis between groups of cells.
This vignette is a modified version of this vignette.
As before, we first load all necessary R packages that are needed for this vignette.
suppressPackageStartupMessages({
library(Signac)
library(Seurat)
library(tidyverse)
library(patchwork)
library(chromVAR)
library(TFBSTools)
library(universalmotif)
library(motifmatchr)
library(BSgenome.Mmusculus.UCSC.mm10)
library(BiocParallel)
library(ggseqlogo)
})
set.seed(1990)
We begin by loading our pre-processed Seurat object from
the introductory QC vignette into R. This code is shared among all
subsequent vignettes.
# Specify the path to the "outs" folder from cellranger-atac here
# Make sure to have a trailing slash here
outFolder="/mnt/data/cellranger/outs/"
seu.s = readRDS(file = paste0(outFolder,"obj.filt.rds"))
# Make sure we have the ATAC array in the object
seu.s[["ATAC"]]
## ChromatinAssay data with 158583 features for 3401 cells
## Variable features: 119325
## Genome: mm10
## Annotation present: TRUE
## Motifs present: FALSE
## Fragment files: 1
# In case the object contains multiple assays, make the ATAC assay the standard
DefaultAssay(seu.s) = "ATAC"
SeuratFor motif analyses, a TF motif database is needed. Here, users have many choices such as JASPAR, HOCOMOCO, CIS-BP, TRANSFAC, SwissRegulon, or UniProbe.
HOCOMOCO is one of the major resources for sequence analysis of transcriptional regulation in mammals and a popular choice for DNA binding specificities for human and mouse transcription factors, and version 11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities.
In order to show you the power and flexibility of R as well as because we have good eperience with HOCOMOCO, we decided to choose this database for this vignette.
Generally speaking, importing the HOCOMOCO database into R would be easy if a designated HOCOMOCO package for R already existed (such as for JASPAR database), but unfortunately, this is not the case to the best of our knowledge. But with the power and flexibility of R, with a few lines of code, we can download the full HOCOMOCO database and all PWM motifs and import it into R without having to go to the terminal at all! Welcome to the world of bioinformatics! By not manually doing these tasks, we also eliminate many possible sources of error, and make this step as reproducible as possible.
As noted above, the following part may seem a bit custom. We have a
few tasks ahead of us: downloading, pre-processing, converting and
overlapping motif locations with our peaks, and finally adding the motif
annotation to our Seurat object. Let’s do it step for
step.
We will first download the whole HOCOMOCO PWM database (which includes many files), and we then convert the downloaded PWM matrices to proper motif objects in R. PWM stands for Position Weight Matrix and describes the probability to find the respective nucleotides A,C,G,T on each position of a motif. Thus, because motifs have different lengths, each TF PWM file has a different number of rows (one for each motif site) but always 4 columns (A, C, G, T).
# prepare motifs HOCOMOCO
file_motifs_rds = paste0(outFolder,"HOCO11MOUSE.pfmlist.rds")
if (file.exists(file_motifs_rds)) {
motifsProcessed <- readRDS(file_motifs_rds)
} else {
file_url = "https://hocomoco11.autosome.ru/final_bundle/hocomoco11/full/MOUSE/mono/HOCOMOCOv11_full_pwm_MOUSE_mono.tar.gz"
file_downloadLocation = paste0(outFolder, "HOCOMOCOv11_full_pwm_MOUSE_mono.tar.gz")
download.file(file_url, destfile= file_downloadLocation)
# First, just check contents without extracting
head(untar(file_downloadLocation,list = TRUE))
# Time to extract the files
untar(file_downloadLocation, exdir = outFolder, verbose = TRUE)
# for file in pwm/*pwm; do echo "" >> "$file"; done
pathToPWMs = paste0(outFolder, "/pwm/")
# We can now automatically save all file names into a vector and process it in R directly
pwmfiles = list.files(pathToPWMs, full.names = TRUE)
motif.list <- vector("list", length(pwmfiles))
i = 1 # Counter
for (fileCurrent in pwmfiles) {
# We now read in the PWM file and create a universalmotif object out of it
# We here suppress warnings that are not relevant for us for improved output
motivCur = suppressWarnings(universalmotif::read_matrix(fileCurrent, headers = ">", sep = "", positions = "rows"))
# We now have to convert the current class universalmotif to class "TFBSTools-PFMatrix
# We here suppress notes that are not relevant for us for improved output
motivConverted = suppressMessages(convert_motifs(motivCur, class = "TFBSTools-PFMatrix"))
# Save it in our list
motif.list[[i]] = motivConverted
# Increase our list counter
i = i +1
}
# We now have a populated list of all motif objects and are almost done.
# We now set proper names for the list, which we simply take from the parsed files
names(motif.list) = str_remove(basename(pwmfiles),".pwm")
# Finally, we can convert it to a PFMatrixList object
motifsProcessed <- do.call(PFMatrixList, motif.list)
saveRDS(motifsProcessed, file = file_motifs_rds)
}
Next, we need to do another step before we finished the motif integration. We now have to scan the DNA sequence of each peak for the presence of each motif and create a motif x feature matrix from a set of genomic ranges. his sounds complicated but can done with the help of just one function!
The following chunk takes a few minutes to compute, and we therefore
save the results to an rds file in the same way as we
stored the Seurat object into an rdsfile -
this allows us to compute the motif matrix only once and load it quickly
thereafter, as opposed to computing it every time we run this
vignette.
# Define a new file name that we try to load into R directly if it exists. If not, we create it
motif.matrix_name = paste0(outFolder,"motifmatrix.mouse.rds")
if (file.exists(motif.matrix_name)) {
motif.matrix = readRDS(file = motif.matrix_name)
} else {
# Let's create it once
# Exercise. What does use.counts do?
motif.matrix <- CreateMotifMatrix(
features = granges(seu.s[["ATAC"]]),
pwm = motifsProcessed, genome = 'mm10',
use.counts = FALSE
)
colnames(motif.matrix) = TFBSTools::name(motifsProcessed)
# Save it, so next time we can directly read it into R
saveRDS(motif.matrix, file = motif.matrix_name)
}
Finally, we have all the required components together and we can
create a Motif object and add it to our Seurat
object, similar to what we did before with the annotations.
# Create a new Motif object to store the results
motifsFinal <- CreateMotifObject(data = motif.matrix, pwm = motifsProcessed)
# Let's print the object summary
motifsFinal
## A Motif object containing 529 motifs in 158583 regions
# Add it to our Seurat object
Motifs(seu.s[["ATAC"]]) <- motifsFinal
chromVARWe can compute a per-cell motif activity score by running an approach called chromVAR, which identifies motifs associated with variability in chromatin accessibility between cells. This can identify differentially-active motifs between cell types. For more details, see the corresponding paper.
Identifying motif activities or TF motif enrichments can help us
predict which regulatory factors are most active in our cell type of
interest. Using chromVAR, we can predict enrichment of TF
activity on a per-cell basis from sparse chromatin accessibility data.
Its two primary outputs of are:
For proper calculation of the deviations scores and the underlying
statistics, one also needs a set of background peaks, which are chosen
using the chromVAR::getBackgroundPeaks() function which
samples peaks based on similarity in GC-content and number of fragments
across all samples using the Mahalanobis distance. Here, it is
called automatically when using the RunChromVAR
wrapper.
Depending on the number of cores, the following code will run 10-30 minutes or so, so please be patient.
nCores = 6
register(MulticoreParam(nCores, progressbar = TRUE))
# Compute the GC content, region lengths, and dinucleotide base frequencies for regions in the assay and add to the feature metadata. This is needed for chromVAR to work
# seu.s <- RegionStats(object = seu.s, genome = BSgenome.Mmusculus.UCSC.mm10)
# Now we can run the chromVAR wrapper
seu.s <- RunChromVAR(object = seu.s, genome = BSgenome.Mmusculus.UCSC.mm10)
## Computing GC bias per region
## Selecting background regions
## Computing deviations from background
##
|
| | 0%
|
| | 1%
|
|= | 1%
|
|= | 2%
|
|== | 2%
|
|== | 3%
|
|=== | 4%
|
|=== | 5%
|
|==== | 5%
|
|==== | 6%
|
|===== | 7%
|
|===== | 8%
|
|====== | 8%
|
|====== | 9%
|
|======= | 9%
|
|======= | 10%
|
|======= | 11%
|
|======== | 11%
|
|======== | 12%
|
|========= | 12%
|
|========= | 13%
|
|========== | 14%
|
|========== | 15%
|
|=========== | 15%
|
|=========== | 16%
|
|============ | 16%
|
|============ | 17%
|
|============ | 18%
|
|============= | 18%
|
|============= | 19%
|
|============== | 19%
|
|============== | 20%
|
|============== | 21%
|
|=============== | 21%
|
|=============== | 22%
|
|================ | 22%
|
|================ | 23%
|
|================= | 24%
|
|================= | 25%
|
|================== | 25%
|
|================== | 26%
|
|=================== | 26%
|
|=================== | 27%
|
|=================== | 28%
|
|==================== | 28%
|
|==================== | 29%
|
|===================== | 29%
|
|===================== | 30%
|
|===================== | 31%
|
|====================== | 31%
|
|====================== | 32%
|
|======================= | 32%
|
|======================= | 33%
|
|======================== | 34%
|
|======================== | 35%
|
|========================= | 35%
|
|========================= | 36%
|
|========================== | 36%
|
|========================== | 37%
|
|========================== | 38%
|
|=========================== | 38%
|
|=========================== | 39%
|
|============================ | 39%
|
|============================ | 40%
|
|============================ | 41%
|
|============================= | 41%
|
|============================= | 42%
|
|============================== | 42%
|
|============================== | 43%
|
|=============================== | 44%
|
|=============================== | 45%
|
|================================ | 45%
|
|================================ | 46%
|
|================================= | 47%
|
|================================= | 48%
|
|================================== | 48%
|
|================================== | 49%
|
|=================================== | 49%
|
|=================================== | 50%
|
|=================================== | 51%
|
|==================================== | 51%
|
|==================================== | 52%
|
|===================================== | 52%
|
|===================================== | 53%
|
|====================================== | 54%
|
|====================================== | 55%
|
|======================================= | 55%
|
|======================================= | 56%
|
|======================================== | 57%
|
|======================================== | 58%
|
|========================================= | 58%
|
|========================================= | 59%
|
|========================================== | 59%
|
|========================================== | 60%
|
|========================================== | 61%
|
|=========================================== | 61%
|
|=========================================== | 62%
|
|============================================ | 62%
|
|============================================ | 63%
|
|============================================ | 64%
|
|============================================= | 64%
|
|============================================= | 65%
|
|============================================== | 65%
|
|============================================== | 66%
|
|=============================================== | 67%
|
|=============================================== | 68%
|
|================================================ | 68%
|
|================================================ | 69%
|
|================================================= | 69%
|
|================================================= | 70%
|
|================================================= | 71%
|
|================================================== | 71%
|
|================================================== | 72%
|
|=================================================== | 72%
|
|=================================================== | 73%
|
|=================================================== | 74%
|
|==================================================== | 74%
|
|==================================================== | 75%
|
|===================================================== | 75%
|
|===================================================== | 76%
|
|====================================================== | 77%
|
|====================================================== | 78%
|
|======================================================= | 78%
|
|======================================================= | 79%
|
|======================================================== | 79%
|
|======================================================== | 80%
|
|======================================================== | 81%
|
|========================================================= | 81%
|
|========================================================= | 82%
|
|========================================================== | 82%
|
|========================================================== | 83%
|
|========================================================== | 84%
|
|=========================================================== | 84%
|
|=========================================================== | 85%
|
|============================================================ | 85%
|
|============================================================ | 86%
|
|============================================================= | 87%
|
|============================================================= | 88%
|
|============================================================== | 88%
|
|============================================================== | 89%
|
|=============================================================== | 89%
|
|=============================================================== | 90%
|
|=============================================================== | 91%
|
|================================================================ | 91%
|
|================================================================ | 92%
|
|================================================================= | 92%
|
|================================================================= | 93%
|
|================================================================== | 94%
|
|================================================================== | 95%
|
|=================================================================== | 95%
|
|=================================================================== | 96%
|
|==================================================================== | 97%
|
|==================================================================== | 98%
|
|===================================================================== | 98%
|
|===================================================================== | 99%
|
|======================================================================| 99%
|
|======================================================================| 100%
## Constructing chromVAR assay
## Warning: Feature names cannot have underscores ('_'), replacing with dashes
## ('-')
This took a while but we finally have the chromVAR
scores that serve as a proxy for TF activity. Let’s check them
out now!
First, we find all markers across all clusters based on the
chromVAR scores given the identity of cells as defined
previously using the dimensionality reduction methods we employed.
DefaultAssay(seu.s) <- 'chromvar'
Idents(seu.s) <- seu.s$ATAC_snn_res.0.5
diffTF_all <- FindAllMarkers(seu.s, only.pos = TRUE, min.pct = 0.25,
logfc.threshold = 0.5, mean.fxn = rowMeans, fc.name = "avg_diff", verbose = FALSE)
Time to visualize our results!
gr1 = diffTF_all %>%
group_by(cluster) %>%
slice_max(n = 3, order_by = avg_diff)
gr1
for (cl in sort(unique(gr1$cluster))) {
fts = gr1 %>% filter(cluster == cl) %>% pull(gene)
p1 = VlnPlot(seu.s, features = fts, slot = "data", log = TRUE, pt.size = 0.1, ) +
ylab("TF activity level") & theme(plot.title = element_text(size = 7))
p2 = FeaturePlot(seu.s, features = fts,
min.cutoff = 'q5',max.cutoff = 'q95',
pt.size = 0.1, ncol = 3, reduction = "umap") & theme(plot.title = element_text(size = 7))
pall = p1 / p2 + plot_annotation(title = paste0("Cluster ",cl))
# Let's not print the warnings about non-finite and missing values
suppressWarnings(print(pall))
}
Let’s again create a heatmap, similar to the heatmap for the gene
activities, this time with the chromVAR values:
diffTF_all_top10 <- diffTF_all %>%
group_by(cluster) %>%
top_n(n = 10, wt = avg_diff)
DoHeatmap(seu.s, features = diffTF_all_top10$gene, slot = "data")
Analysis of single-cell chromatin accessibility data generally allows
us to detect often novel cell clusters, while how to interpret these
clusters remains a big challenge. Let’s try to understand the clusters
better. For this, let’s calculate now the differential TF activity
between a pair of clusters based on chromVAR scores! Note
that this is a different question from what we have done before: So far,
we only identified cluster-specific TFs, while now, we would like to
which TFs make two clusters different!
Thus, instead of finding all TFs that are deemed differential among
all clusters, we can also specifically find marker TFs or marker peaks
that are differential only when comparing a particular set of clusters.
Let’s try this out! Instead of the function FindAllMarkers,
we will use the function FindMarkers.
Let’s first calculate differential TF activity between cluster
1 as compare to cluster 0.
# mean.fxn = Function to use for fold change or average difference calculation
differential.activity <- FindMarkers(
object = seu.s,
ident.1 = '1', ident.2 = '0',
only.pos = TRUE, mean.fxn = rowMeans,
fc.name = "avg_diff",
assay = 'chromvar',
verbose = FALSE)
Now that we identified a few TF that are differentially active between the two specified clusters, let’s try to see what their TF binding motifs (TFBMs) look like. The consensus sequence of a TFBM is variable, and there are a number of possible bases at certain positions in the motif, whereas other positions have a fixed base. These are usually illustrated in sequence logo diagrams (see below), where the height of the letter represents how frequently that nucleotide is observed in that position.
We here plot the first 5 TFBMs to peek into the data:
TFs = rownames(differential.activity) %>% head(5) %>% str_replace("-","_")
MotifPlot(object = seu.s, motifs = TFs, assay = 'ATAC')
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.
Alternatively, to identify potentially important cell-type-specific regulatory sequences, we can search for DNA motifs that are overrepresented in a set of peaks that are differentially accessible between cell types / clusters.
To find them, we can perform a differential accessibility
(DA) test. Here, we utilize a method called logistic regression for DA,
as suggested by Ntranos et al.
2018 for scRNA-seq data, and add the total number of fragments as a
latent variable to mitigate the effect of differential sequencing depth
on the result. In addition, for sparse data such as scATAC-seq, it is
often necessary to lower the min.pct threshold in
FindMarkers() from the default (0.1, which was designed for
scRNA-seq data).
Note that we now have to use our original ATAC assay,
and not the chromVAR assay as before!
# Now, we work again with the ATAC assay and not the chromVAR assay
DefaultAssay(seu.s) = "ATAC"
# min.pct: only test genes that are detected in a minimum fraction of min.pct cells in either of the two population
da_peaks <- FindMarkers(
object = seu.s,ident.1 = '0',ident.2 = '1',
only.pos = TRUE, test.use = 'LR', min.pct = 0.05,
latent.vars = 'peak_region_fragments',
verbose = FALSE)
# get top differentially accessible peaks
top.da.peak <- rownames(da_peaks[da_peaks$p_val < 0.005, ])
We now have a set of differentially active peaks, but how can we judge whether they are statistically meaningful? For proper enrichment analyses and statistics, matching the set of background peaks is essential when finding enriched DNA sequence motifs. A good choice is usually to choose a set of peaks matched for GC content, but it can be sometimes be beneficial to further restrict the background peaks to those that are accessible in the groups of cells compared when finding differentially accessible peaks.
The AccessiblePeaks() function can be used to find a set
of peaks that are open in a subset of cells. We can use this function to
first restrict the set of possible background peaks to those peaks that
were open in the set of cells compared in FindMarkers(),
and then create a GC-content-matched set of peaks from this larger set
using MatchRegionStats().
# find open peaks
open.peaks <- AccessiblePeaks(seu.s, idents = c("1", "0"))
# match the overall GC content in the peak set
# Calculate meta features for object
seu.s <- RegionStats(object = seu.s, genome = BSgenome.Mmusculus.UCSC.mm10, assay = "ATAC")
meta.feature <- GetAssayData(seu.s, assay = "ATAC", slot = "meta.features")
peaks.matched <- MatchRegionStats(
meta.feature = meta.feature[open.peaks, ],
query.feature = meta.feature[top.da.peak, ],
n = 50000
)
## Matching GC.percent distribution
We are now ready to perform a statistical test to quantify the probability of observing the motif at the given frequency by chance, comparing with a background set of peaks matched for GC content, by specifying a foreground and the previously calculated matched background:
# test enrichment
enriched.motifs <- FindMotifs(
object = seu.s,
features = top.da.peak,
background = peaks.matched
)
## Testing motif enrichment in 400 regions
# Let's check the results
enriched.motifs
We can also plot the position weight matrices for the motifs, so we can visualize the different motif sequences. it is finally the time to visualize our results!
TFs = str_replace(head(rownames(enriched.motifs)),"-","_")
MotifPlot(object = seu.s,motifs = TFs,assay = 'ATAC')
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.
We can also visualize these marker peaks on a violin plot, feature plot, dot plot, heat map, or any visualization tool in Seurat.
Finally, as an exercise, you can now repeat the same procedure; but this time, we identify peaks the other way around: Peaks that are more accessible in cluster 1 as compared to cluster 0
Transcription factor (TF) footprinting allows for the prediction of the precise binding location of a TF at a particular locus. For both bulk and scATAC-Seq, this is because the DNA bases that are directly bound by the TF are actually protected from transposition while the DNA bases immediately adjacent to TF binding are accessible, thus leaving a footprint. This allows us to generate average ATAC-seq profiles around binding sites of particular TF (a TF footprint). The last task in this vignette is to generate some TF footprints!
Ideally, TF footprinting is performed at a single site to determine the precise binding location of the TF. However, in practice, this requires very high sequencing depth, often much higher depth than what most users would obtain from either bulk or single-cell ATAC-seq. To get around this problem, we can combine Tn5 insertion locations across many instances of predicted TF binding. For example, we can take all peaks that harbor a CTCF motif and make an aggregate TF footprint for CTCF across the whole genome. The accuracy of this footprint relies on generating a reliable curated list of predicted binding sites for the TF of interest.
To prepare this, we need to add the HOCOMOCO motifs to our
object in a way that is compatible with the footprinting functions. Why
we need to to this? Generally, to facilitate motif analysis in
Signac, we have to create the Motif class to
store all the required information, including a list of position
weight matrices (PWMs) or position frequency matrices
(PFMs) and a motif occurrence matrix. Here, the AddMotifs()
function construct a Motif object and adds it to our
dataset, along with other information such as the base composition of
each peak. Even more generally, a motif object can be added to any
Seurat-compatible assay using the
SetAssayData() function. The following chunk takes a few
minutes to run, so please be patient.
DefaultAssay(seu.s) <- 'ATAC'
seu.s <- AddMotifs(seu.s, genome = BSgenome.Mmusculus.UCSC.mm10, pfm = motifsProcessed)
## Building motif matrix
## Finding motif positions
## Creating Motif object
Alternatively, if we used a different motif database such as
JASPAR, we could add it to our object in a similar
fashion:
library(JASPAR2020)
# Potentially specifying the species is also needed here
motifDB = getMatrixSet(JASPAR2020, opts = list(collection = "CORE", tax_group = 'vertebrates', all_versions = FALSE))
)
Now we can footprint any motif that we have positional information
for. Here, we can run it just for a few selected TFs. The footprinting
function computes the normalized observed vs. expected Tn5 insertion
frequency for each position surrounding a set of motif instances. By
default, this includes every instance of the motif in the genome. We can
instead use the in.peaks = TRUE parameter to include only
those motifs that fall inside a peak in the assay. The
Footprint() function gathers all the required data and
stores it in the assay.
Again, this will take some minutes to run.
# Pre-selected TFs that should be more active for cluster 0 and 1. ERR3 + SOX2 should be more in ES cells (cluster 1) and TEAD4 + PITX1 in neuronal ones (cluster 0).
TFs = c("ERR3_MOUSE.H11MO.0.C", "SOX2_MOUSE.H11MO.0.A", "TEAD4_MOUSE.H11MO.0.A","PITX1_MOUSE.H11MO.0.C")
# gather the footprinting information for sets of motifs
seu.s <- Footprint(
object = seu.s,
motif.name = TFs,
in.peaks = FALSE,
genome = BSgenome.Mmusculus.UCSC.mm10
)
## Computing Tn5 insertion bias
## Extracting reads in requested region
## Computing observed Tn5 insertions per base
## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
## - in 'x': chrM, chr1_GL455991_alt, chr1_GL455992_alt, chr1_GL455993_alt, chr1_GL456005_alt, chr1_JH584315_alt, chr1_JH584320_alt, chr1_JH584321_alt, chr1_JH584322_alt, chr2_GL456024_alt, chr3_GL456006_alt, chr3_GL456007_alt, chr3_GL456008_alt, chr3_GL456042_alt, chr3_GL456044_alt, chr3_GL456045_alt, chr3_GL456048_alt, chr3_GL456049_alt, chr3_JH584323_alt, chr4_GL455994_alt, chr4_GL456009_alt, chr4_GL456010_alt, chr4_GL456053_alt, chr4_GL456064_alt, chr4_GL456075_alt, chr4_GL456076_alt, chr4_GL456077_alt, chr4_JH584268_alt, chr4_JH584269_alt, chr4_JH584324_alt, chr4_JH584325_alt, chr4_JH584326_alt, chr5_GL455995_alt, chr5_GL456011_alt, chr6_GL456012_alt, chr6_GL456025_alt, chr6_GL456026_alt, chr6_GL456054_alt, chr6_GL456065_alt, chr6_JH584264_alt, chr7_GL455989_alt, chr7_GL456013_alt, chr7_GL456014_alt, chr8_GL455996_alt, chr8_GL455997_alt, chr10_GL456015_alt, chr11_GL455998_alt, chr11_GL456016_alt, chr11_GL456060_alt, chr11_JH584265_alt, chr11_JH584316_alt, chr11_JH584317_alt, chr11_JH584327_alt, chr12_GL456017_alt, chr12_GL456068_alt, chr12_GL456074_alt, chr12_GL456078_alt, chr12_GL456349_alt, chr13_GL455990_alt, chr13_GL455999_alt, chr13_JH584305_alt, chr14_GL456019_alt, chr14_GL456020_alt, chr15_GL456000_alt, chr15_JH584270_alt, chr16_GL456001_alt, chr16_GL456028_alt, chr16_JH584306_alt, chr16_JH584307_alt, chr16_JH584310_alt, chr16_JH584311_alt, chr16_JH584312_alt, chr16_JH584313_alt, chr16_JH584314_alt, chr17_GL456002_alt, chr17_GL456021_alt, chr17_GL456022_alt, chr17_GL456069_alt, chr17_JH584266_alt, chr17_JH584267_alt, chr17_JH584308_alt, chr17_JH584309_alt, chr17_JH584328_alt, chr17_JH590470_alt, chr18_GL456070_alt, chr18_JH584318_alt, chr19_GL456079_alt, chr19_JH584319_alt, chrX_GL456003_alt, chrX_GL456004_alt, chrX_GL456031_alt, chrX_GL456032_alt, chrX_GL456033_alt, chrY_GL456071_alt, chrY_GL456072_alt, chrY_GL456073_alt, chrY_GL456080_alt, chrY_GL456081_alt, chrY_GL456082_alt, chrna_GL456050_alt, chr1_GL456210_random, chr1_GL456211_random, chr1_GL456212_random, chr1_GL456213_random, chr1_GL456221_random, chr4_GL456216_random, chr4_GL456350_random, chr4_JH584292_random, chr4_JH584293_random, chr4_JH584294_random, chr4_JH584295_random, chr5_GL456354_random, chr5_JH584296_random, chr5_JH584297_random, chr5_JH584298_random, chr5_JH584299_random, chr7_GL456219_random, chrX_GL456233_random, chrY_JH584300_random, chrY_JH584301_random, chrY_JH584302_random, chrY_JH584303_random, chrUn_GL456239, chrUn_GL456359, chrUn_GL456360, chrUn_GL456366, chrUn_GL456367, chrUn_GL456368, chrUn_GL456370, chrUn_GL456372, chrUn_GL456378, chrUn_GL456379, chrUn_GL456381, chrUn_GL456382, chrUn_GL456383, chrUn_GL456385, chrUn_GL456387, chrUn_GL456389, chrUn_GL456390, chrUn_GL456392, chrUn_GL456393, chrUn_GL456394, chrUn_GL456396, chrUn_JH584304, chr1_KV575232_fix, chr1_KV575233_fix, chr1_KV575234_fix, chr2_KV575235_fix, chr2_KV575236_fix, chr3_KQ030484_fix, chr3_KZ289064_fix, chr3_KZ289065_fix, chr3_KZ289066_fix, chr4_JH792826_fix, chr4_KQ030485_fix, chr4_KQ030486_fix, chr4_KQ030487_fix, chr4_KQ030488_fix, chr4_KQ030489_fix, chr4_KZ289067_fix, chr4_KZ289068_fix, chr4_KZ289069_fix, chr4_KZ289070_fix, chr5_JH792827_fix, chr5_KV575237_fix, chr6_KK082442_fix, chr6_KK082443_fix, chr7_JH792828_fix, chr7_KV575238_fix, chr7_KV575239_fix, chr8_KV575240_fix, chr9_KB469738_fix, chr9_KQ030490_fix, chr10_KQ030491_fix, chr10_KZ289071_fix, chr10_KZ289072_fix, chr11_KB469739_fix, chr11_KZ289076_fix, chr12_KB469740_fix, chr12_KZ289082_fix, chr14_KZ289083_fix, chr14_KZ289084_fix, chr15_KQ030492_fix, chr15_KQ030493_fix, chr15_KV575241_fix, chr15_KZ289085_fix, chr15_KZ289086_fix, chr16_KB469741_fix, chr17_KB469742_fix, chr17_KZ289087_fix, chr17_KZ289088_fix, chr17_KZ289089_fix, chr18_JH792829_fix, chr18_KZ289090_fix, chr18_KZ289091_fix, chr19_JH792830_fix, chr19_KQ030494_fix, chr19_KV575242_fix, chrX_JH792831_fix, chrX_KQ030495_fix, chrX_KQ030496_fix, chrX_KQ030497_fix, chrX_KZ289092_fix, chrX_KZ289093_fix, chrX_KZ289094_fix, chrX_KZ289095_fix, chrY_JH792832_fix, chrY_JH792833_fix, chrY_JH792834_fix, chr1_KK082441_alt, chr11_KZ289073_alt, chr11_KZ289074_alt, chr11_KZ289075_alt, chr11_KZ289077_alt, chr11_KZ289078_alt, chr11_KZ289079_alt, chr11_KZ289080_alt, chr11_KZ289081_alt
## - in 'y': GL456216.1, GL456221.1, GL456233.1, JH584295.1, JH584304.1
## Make sure to always combine/compare objects based on the same reference
## genome (use suppressWarnings() to suppress this warning).
## Computing base composition at motif sites
## Computing expected Tn5 insertions per base
## Finding + strand cut sites
## Finding - strand cut sites
## Computing observed Tn5 insertions per base
## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
## - in 'x': chrM, chr1_GL455991_alt, chr1_GL455992_alt, chr1_GL455993_alt, chr1_GL456005_alt, chr1_JH584315_alt, chr1_JH584320_alt, chr1_JH584321_alt, chr1_JH584322_alt, chr2_GL456024_alt, chr3_GL456006_alt, chr3_GL456007_alt, chr3_GL456008_alt, chr3_GL456042_alt, chr3_GL456044_alt, chr3_GL456045_alt, chr3_GL456048_alt, chr3_GL456049_alt, chr3_JH584323_alt, chr4_GL455994_alt, chr4_GL456009_alt, chr4_GL456010_alt, chr4_GL456053_alt, chr4_GL456064_alt, chr4_GL456075_alt, chr4_GL456076_alt, chr4_GL456077_alt, chr4_JH584268_alt, chr4_JH584269_alt, chr4_JH584324_alt, chr4_JH584325_alt, chr4_JH584326_alt, chr5_GL455995_alt, chr5_GL456011_alt, chr6_GL456012_alt, chr6_GL456025_alt, chr6_GL456026_alt, chr6_GL456054_alt, chr6_GL456065_alt, chr6_JH584264_alt, chr7_GL455989_alt, chr7_GL456013_alt, chr7_GL456014_alt, chr8_GL455996_alt, chr8_GL455997_alt, chr10_GL456015_alt, chr11_GL455998_alt, chr11_GL456016_alt, chr11_GL456060_alt, chr11_JH584265_alt, chr11_JH584316_alt, chr11_JH584317_alt, chr11_JH584327_alt, chr12_GL456017_alt, chr12_GL456068_alt, chr12_GL456074_alt, chr12_GL456078_alt, chr12_GL456349_alt, chr13_GL455990_alt, chr13_GL455999_alt, chr13_JH584305_alt, chr14_GL456019_alt, chr14_GL456020_alt, chr15_GL456000_alt, chr15_JH584270_alt, chr16_GL456001_alt, chr16_GL456028_alt, chr16_JH584306_alt, chr16_JH584307_alt, chr16_JH584310_alt, chr16_JH584311_alt, chr16_JH584312_alt, chr16_JH584313_alt, chr16_JH584314_alt, chr17_GL456002_alt, chr17_GL456021_alt, chr17_GL456022_alt, chr17_GL456069_alt, chr17_JH584266_alt, chr17_JH584267_alt, chr17_JH584308_alt, chr17_JH584309_alt, chr17_JH584328_alt, chr17_JH590470_alt, chr18_GL456070_alt, chr18_JH584318_alt, chr19_GL456079_alt, chr19_JH584319_alt, chrX_GL456003_alt, chrX_GL456004_alt, chrX_GL456031_alt, chrX_GL456032_alt, chrX_GL456033_alt, chrY_GL456071_alt, chrY_GL456072_alt, chrY_GL456073_alt, chrY_GL456080_alt, chrY_GL456081_alt, chrY_GL456082_alt, chrna_GL456050_alt, chr1_GL456210_random, chr1_GL456211_random, chr1_GL456212_random, chr1_GL456213_random, chr1_GL456221_random, chr4_GL456216_random, chr4_GL456350_random, chr4_JH584292_random, chr4_JH584293_random, chr4_JH584294_random, chr4_JH584295_random, chr5_GL456354_random, chr5_JH584296_random, chr5_JH584297_random, chr5_JH584298_random, chr5_JH584299_random, chr7_GL456219_random, chrX_GL456233_random, chrY_JH584300_random, chrY_JH584301_random, chrY_JH584302_random, chrY_JH584303_random, chrUn_GL456239, chrUn_GL456359, chrUn_GL456360, chrUn_GL456366, chrUn_GL456367, chrUn_GL456368, chrUn_GL456370, chrUn_GL456372, chrUn_GL456378, chrUn_GL456379, chrUn_GL456381, chrUn_GL456382, chrUn_GL456383, chrUn_GL456385, chrUn_GL456387, chrUn_GL456389, chrUn_GL456390, chrUn_GL456392, chrUn_GL456393, chrUn_GL456394, chrUn_GL456396, chrUn_JH584304, chr1_KV575232_fix, chr1_KV575233_fix, chr1_KV575234_fix, chr2_KV575235_fix, chr2_KV575236_fix, chr3_KQ030484_fix, chr3_KZ289064_fix, chr3_KZ289065_fix, chr3_KZ289066_fix, chr4_JH792826_fix, chr4_KQ030485_fix, chr4_KQ030486_fix, chr4_KQ030487_fix, chr4_KQ030488_fix, chr4_KQ030489_fix, chr4_KZ289067_fix, chr4_KZ289068_fix, chr4_KZ289069_fix, chr4_KZ289070_fix, chr5_JH792827_fix, chr5_KV575237_fix, chr6_KK082442_fix, chr6_KK082443_fix, chr7_JH792828_fix, chr7_KV575238_fix, chr7_KV575239_fix, chr8_KV575240_fix, chr9_KB469738_fix, chr9_KQ030490_fix, chr10_KQ030491_fix, chr10_KZ289071_fix, chr10_KZ289072_fix, chr11_KB469739_fix, chr11_KZ289076_fix, chr12_KB469740_fix, chr12_KZ289082_fix, chr14_KZ289083_fix, chr14_KZ289084_fix, chr15_KQ030492_fix, chr15_KQ030493_fix, chr15_KV575241_fix, chr15_KZ289085_fix, chr15_KZ289086_fix, chr16_KB469741_fix, chr17_KB469742_fix, chr17_KZ289087_fix, chr17_KZ289088_fix, chr17_KZ289089_fix, chr18_JH792829_fix, chr18_KZ289090_fix, chr18_KZ289091_fix, chr19_JH792830_fix, chr19_KQ030494_fix, chr19_KV575242_fix, chrX_JH792831_fix, chrX_KQ030495_fix, chrX_KQ030496_fix, chrX_KQ030497_fix, chrX_KZ289092_fix, chrX_KZ289093_fix, chrX_KZ289094_fix, chrX_KZ289095_fix, chrY_JH792832_fix, chrY_JH792833_fix, chrY_JH792834_fix, chr1_KK082441_alt, chr11_KZ289073_alt, chr11_KZ289074_alt, chr11_KZ289075_alt, chr11_KZ289077_alt, chr11_KZ289078_alt, chr11_KZ289079_alt, chr11_KZ289080_alt, chr11_KZ289081_alt
## - in 'y': GL456216.1, GL456221.1, GL456233.1, JH584295.1, JH584304.1
## Make sure to always combine/compare objects based on the same reference
## genome (use suppressWarnings() to suppress this warning).
## Computing base composition at motif sites
## Computing expected Tn5 insertions per base
## Finding + strand cut sites
## Finding - strand cut sites
## Computing observed Tn5 insertions per base
## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
## - in 'x': chrM, chr1_GL455991_alt, chr1_GL455992_alt, chr1_GL455993_alt, chr1_GL456005_alt, chr1_JH584315_alt, chr1_JH584320_alt, chr1_JH584321_alt, chr1_JH584322_alt, chr2_GL456024_alt, chr3_GL456006_alt, chr3_GL456007_alt, chr3_GL456008_alt, chr3_GL456042_alt, chr3_GL456044_alt, chr3_GL456045_alt, chr3_GL456048_alt, chr3_GL456049_alt, chr3_JH584323_alt, chr4_GL455994_alt, chr4_GL456009_alt, chr4_GL456010_alt, chr4_GL456053_alt, chr4_GL456064_alt, chr4_GL456075_alt, chr4_GL456076_alt, chr4_GL456077_alt, chr4_JH584268_alt, chr4_JH584269_alt, chr4_JH584324_alt, chr4_JH584325_alt, chr4_JH584326_alt, chr5_GL455995_alt, chr5_GL456011_alt, chr6_GL456012_alt, chr6_GL456025_alt, chr6_GL456026_alt, chr6_GL456054_alt, chr6_GL456065_alt, chr6_JH584264_alt, chr7_GL455989_alt, chr7_GL456013_alt, chr7_GL456014_alt, chr8_GL455996_alt, chr8_GL455997_alt, chr10_GL456015_alt, chr11_GL455998_alt, chr11_GL456016_alt, chr11_GL456060_alt, chr11_JH584265_alt, chr11_JH584316_alt, chr11_JH584317_alt, chr11_JH584327_alt, chr12_GL456017_alt, chr12_GL456068_alt, chr12_GL456074_alt, chr12_GL456078_alt, chr12_GL456349_alt, chr13_GL455990_alt, chr13_GL455999_alt, chr13_JH584305_alt, chr14_GL456019_alt, chr14_GL456020_alt, chr15_GL456000_alt, chr15_JH584270_alt, chr16_GL456001_alt, chr16_GL456028_alt, chr16_JH584306_alt, chr16_JH584307_alt, chr16_JH584310_alt, chr16_JH584311_alt, chr16_JH584312_alt, chr16_JH584313_alt, chr16_JH584314_alt, chr17_GL456002_alt, chr17_GL456021_alt, chr17_GL456022_alt, chr17_GL456069_alt, chr17_JH584266_alt, chr17_JH584267_alt, chr17_JH584308_alt, chr17_JH584309_alt, chr17_JH584328_alt, chr17_JH590470_alt, chr18_GL456070_alt, chr18_JH584318_alt, chr19_GL456079_alt, chr19_JH584319_alt, chrX_GL456003_alt, chrX_GL456004_alt, chrX_GL456031_alt, chrX_GL456032_alt, chrX_GL456033_alt, chrY_GL456071_alt, chrY_GL456072_alt, chrY_GL456073_alt, chrY_GL456080_alt, chrY_GL456081_alt, chrY_GL456082_alt, chrna_GL456050_alt, chr1_GL456210_random, chr1_GL456211_random, chr1_GL456212_random, chr1_GL456213_random, chr1_GL456221_random, chr4_GL456216_random, chr4_GL456350_random, chr4_JH584292_random, chr4_JH584293_random, chr4_JH584294_random, chr4_JH584295_random, chr5_GL456354_random, chr5_JH584296_random, chr5_JH584297_random, chr5_JH584298_random, chr5_JH584299_random, chr7_GL456219_random, chrX_GL456233_random, chrY_JH584300_random, chrY_JH584301_random, chrY_JH584302_random, chrY_JH584303_random, chrUn_GL456239, chrUn_GL456359, chrUn_GL456360, chrUn_GL456366, chrUn_GL456367, chrUn_GL456368, chrUn_GL456370, chrUn_GL456372, chrUn_GL456378, chrUn_GL456379, chrUn_GL456381, chrUn_GL456382, chrUn_GL456383, chrUn_GL456385, chrUn_GL456387, chrUn_GL456389, chrUn_GL456390, chrUn_GL456392, chrUn_GL456393, chrUn_GL456394, chrUn_GL456396, chrUn_JH584304, chr1_KV575232_fix, chr1_KV575233_fix, chr1_KV575234_fix, chr2_KV575235_fix, chr2_KV575236_fix, chr3_KQ030484_fix, chr3_KZ289064_fix, chr3_KZ289065_fix, chr3_KZ289066_fix, chr4_JH792826_fix, chr4_KQ030485_fix, chr4_KQ030486_fix, chr4_KQ030487_fix, chr4_KQ030488_fix, chr4_KQ030489_fix, chr4_KZ289067_fix, chr4_KZ289068_fix, chr4_KZ289069_fix, chr4_KZ289070_fix, chr5_JH792827_fix, chr5_KV575237_fix, chr6_KK082442_fix, chr6_KK082443_fix, chr7_JH792828_fix, chr7_KV575238_fix, chr7_KV575239_fix, chr8_KV575240_fix, chr9_KB469738_fix, chr9_KQ030490_fix, chr10_KQ030491_fix, chr10_KZ289071_fix, chr10_KZ289072_fix, chr11_KB469739_fix, chr11_KZ289076_fix, chr12_KB469740_fix, chr12_KZ289082_fix, chr14_KZ289083_fix, chr14_KZ289084_fix, chr15_KQ030492_fix, chr15_KQ030493_fix, chr15_KV575241_fix, chr15_KZ289085_fix, chr15_KZ289086_fix, chr16_KB469741_fix, chr17_KB469742_fix, chr17_KZ289087_fix, chr17_KZ289088_fix, chr17_KZ289089_fix, chr18_JH792829_fix, chr18_KZ289090_fix, chr18_KZ289091_fix, chr19_JH792830_fix, chr19_KQ030494_fix, chr19_KV575242_fix, chrX_JH792831_fix, chrX_KQ030495_fix, chrX_KQ030496_fix, chrX_KQ030497_fix, chrX_KZ289092_fix, chrX_KZ289093_fix, chrX_KZ289094_fix, chrX_KZ289095_fix, chrY_JH792832_fix, chrY_JH792833_fix, chrY_JH792834_fix, chr1_KK082441_alt, chr11_KZ289073_alt, chr11_KZ289074_alt, chr11_KZ289075_alt, chr11_KZ289077_alt, chr11_KZ289078_alt, chr11_KZ289079_alt, chr11_KZ289080_alt, chr11_KZ289081_alt
## - in 'y': GL456216.1, GL456221.1, GL456233.1, JH584295.1, JH584304.1
## Make sure to always combine/compare objects based on the same reference
## genome (use suppressWarnings() to suppress this warning).
## Computing base composition at motif sites
## Computing expected Tn5 insertions per base
## Finding + strand cut sites
## Finding - strand cut sites
## Computing observed Tn5 insertions per base
## Warning in .Seqinfo.mergexy(x, y): Each of the 2 combined objects has sequence levels not in the other:
## - in 'x': chrM, chr1_GL455991_alt, chr1_GL455992_alt, chr1_GL455993_alt, chr1_GL456005_alt, chr1_JH584315_alt, chr1_JH584320_alt, chr1_JH584321_alt, chr1_JH584322_alt, chr2_GL456024_alt, chr3_GL456006_alt, chr3_GL456007_alt, chr3_GL456008_alt, chr3_GL456042_alt, chr3_GL456044_alt, chr3_GL456045_alt, chr3_GL456048_alt, chr3_GL456049_alt, chr3_JH584323_alt, chr4_GL455994_alt, chr4_GL456009_alt, chr4_GL456010_alt, chr4_GL456053_alt, chr4_GL456064_alt, chr4_GL456075_alt, chr4_GL456076_alt, chr4_GL456077_alt, chr4_JH584268_alt, chr4_JH584269_alt, chr4_JH584324_alt, chr4_JH584325_alt, chr4_JH584326_alt, chr5_GL455995_alt, chr5_GL456011_alt, chr6_GL456012_alt, chr6_GL456025_alt, chr6_GL456026_alt, chr6_GL456054_alt, chr6_GL456065_alt, chr6_JH584264_alt, chr7_GL455989_alt, chr7_GL456013_alt, chr7_GL456014_alt, chr8_GL455996_alt, chr8_GL455997_alt, chr10_GL456015_alt, chr11_GL455998_alt, chr11_GL456016_alt, chr11_GL456060_alt, chr11_JH584265_alt, chr11_JH584316_alt, chr11_JH584317_alt, chr11_JH584327_alt, chr12_GL456017_alt, chr12_GL456068_alt, chr12_GL456074_alt, chr12_GL456078_alt, chr12_GL456349_alt, chr13_GL455990_alt, chr13_GL455999_alt, chr13_JH584305_alt, chr14_GL456019_alt, chr14_GL456020_alt, chr15_GL456000_alt, chr15_JH584270_alt, chr16_GL456001_alt, chr16_GL456028_alt, chr16_JH584306_alt, chr16_JH584307_alt, chr16_JH584310_alt, chr16_JH584311_alt, chr16_JH584312_alt, chr16_JH584313_alt, chr16_JH584314_alt, chr17_GL456002_alt, chr17_GL456021_alt, chr17_GL456022_alt, chr17_GL456069_alt, chr17_JH584266_alt, chr17_JH584267_alt, chr17_JH584308_alt, chr17_JH584309_alt, chr17_JH584328_alt, chr17_JH590470_alt, chr18_GL456070_alt, chr18_JH584318_alt, chr19_GL456079_alt, chr19_JH584319_alt, chrX_GL456003_alt, chrX_GL456004_alt, chrX_GL456031_alt, chrX_GL456032_alt, chrX_GL456033_alt, chrY_GL456071_alt, chrY_GL456072_alt, chrY_GL456073_alt, chrY_GL456080_alt, chrY_GL456081_alt, chrY_GL456082_alt, chrna_GL456050_alt, chr1_GL456210_random, chr1_GL456211_random, chr1_GL456212_random, chr1_GL456213_random, chr1_GL456221_random, chr4_GL456216_random, chr4_GL456350_random, chr4_JH584292_random, chr4_JH584293_random, chr4_JH584294_random, chr4_JH584295_random, chr5_GL456354_random, chr5_JH584296_random, chr5_JH584297_random, chr5_JH584298_random, chr5_JH584299_random, chr7_GL456219_random, chrX_GL456233_random, chrY_JH584300_random, chrY_JH584301_random, chrY_JH584302_random, chrY_JH584303_random, chrUn_GL456239, chrUn_GL456359, chrUn_GL456360, chrUn_GL456366, chrUn_GL456367, chrUn_GL456368, chrUn_GL456370, chrUn_GL456372, chrUn_GL456378, chrUn_GL456379, chrUn_GL456381, chrUn_GL456382, chrUn_GL456383, chrUn_GL456385, chrUn_GL456387, chrUn_GL456389, chrUn_GL456390, chrUn_GL456392, chrUn_GL456393, chrUn_GL456394, chrUn_GL456396, chrUn_JH584304, chr1_KV575232_fix, chr1_KV575233_fix, chr1_KV575234_fix, chr2_KV575235_fix, chr2_KV575236_fix, chr3_KQ030484_fix, chr3_KZ289064_fix, chr3_KZ289065_fix, chr3_KZ289066_fix, chr4_JH792826_fix, chr4_KQ030485_fix, chr4_KQ030486_fix, chr4_KQ030487_fix, chr4_KQ030488_fix, chr4_KQ030489_fix, chr4_KZ289067_fix, chr4_KZ289068_fix, chr4_KZ289069_fix, chr4_KZ289070_fix, chr5_JH792827_fix, chr5_KV575237_fix, chr6_KK082442_fix, chr6_KK082443_fix, chr7_JH792828_fix, chr7_KV575238_fix, chr7_KV575239_fix, chr8_KV575240_fix, chr9_KB469738_fix, chr9_KQ030490_fix, chr10_KQ030491_fix, chr10_KZ289071_fix, chr10_KZ289072_fix, chr11_KB469739_fix, chr11_KZ289076_fix, chr12_KB469740_fix, chr12_KZ289082_fix, chr14_KZ289083_fix, chr14_KZ289084_fix, chr15_KQ030492_fix, chr15_KQ030493_fix, chr15_KV575241_fix, chr15_KZ289085_fix, chr15_KZ289086_fix, chr16_KB469741_fix, chr17_KB469742_fix, chr17_KZ289087_fix, chr17_KZ289088_fix, chr17_KZ289089_fix, chr18_JH792829_fix, chr18_KZ289090_fix, chr18_KZ289091_fix, chr19_JH792830_fix, chr19_KQ030494_fix, chr19_KV575242_fix, chrX_JH792831_fix, chrX_KQ030495_fix, chrX_KQ030496_fix, chrX_KQ030497_fix, chrX_KZ289092_fix, chrX_KZ289093_fix, chrX_KZ289094_fix, chrX_KZ289095_fix, chrY_JH792832_fix, chrY_JH792833_fix, chrY_JH792834_fix, chr1_KK082441_alt, chr11_KZ289073_alt, chr11_KZ289074_alt, chr11_KZ289075_alt, chr11_KZ289077_alt, chr11_KZ289078_alt, chr11_KZ289079_alt, chr11_KZ289080_alt, chr11_KZ289081_alt
## - in 'y': GL456216.1, GL456221.1, GL456233.1, JH584295.1, JH584304.1
## Make sure to always combine/compare objects based on the same reference
## genome (use suppressWarnings() to suppress this warning).
## Computing base composition at motif sites
## Computing expected Tn5 insertions per base
## Finding + strand cut sites
## Finding - strand cut sites
We can now plot the footprinted motifs using the
PlotFootprint() function.
TFs = c("ERR3_MOUSE.H11MO.0.C", "SOX2_MOUSE.H11MO.0.A", "TEAD4_MOUSE.H11MO.0.A","PITX1_MOUSE.H11MO.0.C")
# Plot the footprint data for each group of cells and each cluster
p2 <- PlotFootprint(seu.s, features = TFs[1:2])
p2 + patchwork::plot_layout(ncol = 2) & theme(plot.title = element_text(size = 10))
## Warning: Removed 2550 rows containing missing values (geom_label_repel).
## Warning: Removed 2580 rows containing missing values (geom_label_repel).
p3 <- PlotFootprint(seu.s, features = TFs[3:4])
p3 + patchwork::plot_layout(ncol = 2) & theme(plot.title = element_text(size = 10))
## Warning: Removed 2575 rows containing missing values (geom_label_repel).
## Warning: Removed 2550 rows containing missing values (geom_label_repel).
This concludes the TF footprint analysis as part of this vignette!
We reached the end of the vignette. We again save our updated
Seurat object to disk with a new name, in analogy to what
we did in the other vignettes.
saveRDS(seu.s, file = paste0(outFolder,"obj.filt.TFAnalysis.rds"))
It is good practice to print the so-called session info at the end of an R script, which prints all loaded libraries, their versions etc. This can be helpful for reproducibility and recapitulating which package versions have been used to produce the results obtained above.
sessionInfo()
## R version 4.2.0 (2022-04-22)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] ggseqlogo_0.1 BiocParallel_1.30.0
## [3] BSgenome.Mmusculus.UCSC.mm10_1.4.3 BSgenome_1.64.0
## [5] rtracklayer_1.56.0 Biostrings_2.64.0
## [7] XVector_0.36.0 GenomicRanges_1.48.0
## [9] GenomeInfoDb_1.32.1 IRanges_2.30.0
## [11] S4Vectors_0.34.0 BiocGenerics_0.42.0
## [13] motifmatchr_1.18.0 universalmotif_1.14.0
## [15] TFBSTools_1.34.0 chromVAR_1.18.0
## [17] patchwork_1.1.1 forcats_0.5.1
## [19] stringr_1.4.0 dplyr_1.0.9
## [21] purrr_0.3.4 readr_2.1.2
## [23] tidyr_1.2.0 tibble_3.1.7
## [25] ggplot2_3.3.6 tidyverse_1.3.1
## [27] sp_1.4-7 SeuratObject_4.1.0
## [29] Seurat_4.1.1 Signac_1.6.0
##
## loaded via a namespace (and not attached):
## [1] SnowballC_0.7.0 scattermore_0.8
## [3] R.methodsS3_1.8.1 nabor_0.5.0
## [5] bit64_4.0.5 knitr_1.39
## [7] irlba_2.3.5 DelayedArray_0.22.0
## [9] R.utils_2.11.0 data.table_1.14.2
## [11] rpart_4.1.16 KEGGREST_1.36.0
## [13] RCurl_1.98-1.6 generics_0.1.2
## [15] cowplot_1.1.1 RSQLite_2.2.13
## [17] RANN_2.6.1 future_1.25.0
## [19] bit_4.0.4 tzdb_0.3.0
## [21] spatstat.data_2.2-0 xml2_1.3.3
## [23] lubridate_1.8.0 httpuv_1.6.5
## [25] SummarizedExperiment_1.26.1 assertthat_0.2.1
## [27] DirichletMultinomial_1.38.0 xfun_0.30
## [29] hms_1.1.1 jquerylib_0.1.4
## [31] evaluate_0.15 promises_1.2.0.1
## [33] fansi_1.0.3 restfulr_0.0.13
## [35] caTools_1.18.2 dbplyr_2.1.1
## [37] readxl_1.4.0 igraph_1.3.1
## [39] DBI_1.1.2 htmlwidgets_1.5.4
## [41] sparsesvd_0.2 spatstat.geom_2.4-0
## [43] ellipsis_0.3.2 backports_1.4.1
## [45] annotate_1.74.0 deldir_1.0-6
## [47] MatrixGenerics_1.8.0 vctrs_0.4.1
## [49] Biobase_2.56.0 ROCR_1.0-11
## [51] abind_1.4-5 cachem_1.0.6
## [53] withr_2.5.0 ggforce_0.3.3
## [55] progressr_0.10.0 sctransform_0.3.3
## [57] GenomicAlignments_1.32.0 goftest_1.2-3
## [59] cluster_2.1.3 lazyeval_0.2.2
## [61] seqLogo_1.62.0 crayon_1.5.1
## [63] pkgconfig_2.0.3 slam_0.1-50
## [65] labeling_0.4.2 tweenr_1.0.2
## [67] nlme_3.1-157 rlang_1.0.2
## [69] globals_0.14.0 lifecycle_1.0.1
## [71] miniUI_0.1.1.1 modelr_0.1.8
## [73] cellranger_1.1.0 polyclip_1.10-0
## [75] matrixStats_0.62.0 lmtest_0.9-40
## [77] Matrix_1.4-1 zoo_1.8-10
## [79] reprex_2.0.1 ggridges_0.5.3
## [81] png_0.1-7 viridisLite_0.4.0
## [83] rjson_0.2.21 bitops_1.0-7
## [85] R.oo_1.24.0 KernSmooth_2.23-20
## [87] blob_1.2.3 parallelly_1.31.1
## [89] spatstat.random_2.2-0 CNEr_1.32.0
## [91] scales_1.2.0 memoise_2.0.1
## [93] magrittr_2.0.3 plyr_1.8.7
## [95] ica_1.0-2 zlibbioc_1.42.0
## [97] compiler_4.2.0 BiocIO_1.6.0
## [99] RColorBrewer_1.1-3 fitdistrplus_1.1-8
## [101] Rsamtools_2.12.0 cli_3.3.0
## [103] listenv_0.8.0 pbapply_1.5-0
## [105] MASS_7.3-56 mgcv_1.8-40
## [107] tidyselect_1.1.2 stringi_1.7.6
## [109] highr_0.9 yaml_2.3.5
## [111] ggrepel_0.9.1 grid_4.2.0
## [113] sass_0.4.1 fastmatch_1.1-3
## [115] tools_4.2.0 future.apply_1.9.0
## [117] parallel_4.2.0 rstudioapi_0.13
## [119] TFMPvalue_0.0.8 lsa_0.73.2
## [121] gridExtra_2.3 farver_2.1.0
## [123] Rtsne_0.16 digest_0.6.29
## [125] rgeos_0.5-9 shiny_1.7.1
## [127] pracma_2.3.8 qlcMatrix_0.9.7
## [129] Rcpp_1.0.8.3 broom_0.8.0
## [131] later_1.3.0 RcppAnnoy_0.0.19
## [133] httr_1.4.3 AnnotationDbi_1.58.0
## [135] colorspace_2.0-3 rvest_1.0.2
## [137] XML_3.99-0.9 fs_1.5.2
## [139] tensor_1.5 reticulate_1.24
## [141] splines_4.2.0 uwot_0.1.11
## [143] RcppRoll_0.3.0 spatstat.utils_2.3-0
## [145] plotly_4.10.0 xtable_1.8-4
## [147] jsonlite_1.8.0 poweRlaw_0.70.6
## [149] R6_2.5.1 pillar_1.7.0
## [151] htmltools_0.5.2 mime_0.12
## [153] glue_1.6.2 fastmap_1.1.0
## [155] DT_0.22 codetools_0.2-18
## [157] utf8_1.2.2 lattice_0.20-45
## [159] bslib_0.3.1 spatstat.sparse_2.1-1
## [161] leiden_0.3.10 gtools_3.9.2
## [163] GO.db_3.15.0 survival_3.2-13
## [165] limma_3.52.0 rmarkdown_2.14
## [167] docopt_0.7.1 munsell_0.5.0
## [169] GenomeInfoDbData_1.2.8 haven_2.5.0
## [171] reshape2_1.4.4 gtable_0.3.0
## [173] spatstat.core_2.4-2